
    Basis Expansions for Functional Snippets

    Estimation of mean and covariance functions is fundamental for functional data analysis. While this topic has been studied extensively in the literature, a key assumption is that there are enough data in the domain of interest to estimate both the mean and covariance functions. In this paper, we investigate mean and covariance estimation for functional snippets, in which observations from a subject are available only in an interval of length strictly (and often much) shorter than the length of the whole interval of interest. For such a sampling plan, no data are available for direct estimation of the off-diagonal region of the covariance function. We tackle this challenge via a basis representation of the covariance function. The proposed approach allows one to consistently estimate an infinite-rank covariance function from functional snippets. We establish the convergence rates for the proposed estimators and illustrate their finite-sample performance via simulation studies and two data applications. Comment: 51 pages, 10 figures
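
To illustrate what a basis representation of a covariance function can look like, here is a generic sketch. It is not necessarily the estimator used in the paper. The symbols below (the basis functions $\phi_j$, the coefficients $c_{jk}$, the truncation level $p$, and the snippet length $\delta$) are assumptions introduced only for this illustration.

```latex
% Illustrative only: a truncated basis expansion of a covariance function C(s,t).
% \phi_1, \dots, \phi_p are assumed basis functions on the whole interval of interest,
% and (c_{jk}) is an assumed symmetric p x p coefficient matrix.
\[
  C(s,t) \;\approx\; \sum_{j=1}^{p} \sum_{k=1}^{p} c_{jk}\, \phi_j(s)\, \phi_k(t),
  \qquad c_{jk} = c_{kj}.
\]
% With snippets of length \delta, raw covariance information is observed only
% on the diagonal band \{(s,t) : |s - t| \le \delta\}. Once the coefficients are
% fitted on that band, the same expansion can be evaluated at any (s,t),
% including the off-diagonal region where no direct data exist.
```

The point of the sketch is that the coefficients are global quantities: they can be fitted from the observed band and then reused to evaluate the expansion everywhere on the domain.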

    Algorithms for pre-microrna classification and a GPU program for whole genome comparison

    MicroRNAs (miRNAs) are non-coding RNAs of approximately 22 nucleotides that are derived from precursor molecules. These precursor molecules, or pre-miRNAs, often fold into stem-loop hairpin structures. However, a large number of sequences with pre-miRNA-like hairpins can be found in genomes. It is a challenge to distinguish the real pre-miRNAs from other hairpin sequences with similar stem-loops (referred to as pseudo pre-miRNAs). The first part of this dissertation presents a new method, called MirID, for identifying and classifying microRNA precursors. MirID comprises three steps. First, a combinatorial feature mining algorithm is developed to identify suitable feature sets. Then, the feature sets are used to train support vector machines to obtain classification models, from which a classifier ensemble is constructed. Finally, an AdaBoost algorithm is adopted to further enhance the accuracy of the classifier ensemble. Experimental results on a variety of species demonstrate the good performance of the proposed approach and its superiority over existing methods. In the second part of this dissertation, a GPU (Graphics Processing Unit) program is developed for whole genome comparison. The goal of the research is to identify the commonalities and differences of two genomes from closely related organisms via multiple sequence alignments, using a seed-and-extend technique to choose reliable subsets of exact or near-exact matches, which are called anchors. The rigorous Smith-Waterman search is applied to find the anchors, but it takes days or even months to map millions of bases for mammalian genome sequences. With GPU programming, which is designed to run hundreds of short functions called threads in parallel, a speedup of up to 100X is achieved over similar CPU executions.
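
As a rough illustration of the Smith-Waterman search named in the abstract, the following is a minimal single-threaded Python sketch of the textbook local alignment algorithm with a linear gap penalty. It is not the dissertation's GPU program. The function name `smith_waterman` and the scoring parameters (`match`, `mismatch`, `gap`) are assumptions chosen for the example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Textbook Smith-Waterman local alignment with a linear gap penalty.

    Scoring values are illustrative assumptions, not taken from the source.
    Returns (best_score, aligned_a, aligned_b).
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                    # start a new local alignment
                H[i - 1][j - 1] + sub,  # match or mismatch
                H[i - 1][j] + gap,      # gap in b
                H[i][j - 1] + gap,      # gap in a
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("TTACGTTT", "GGACGTGG"))
```

The fill step is quadratic in the sequence lengths, which is consistent with the abstract's remark that the search becomes very slow for mammalian-scale genomes on a CPU.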

    Feature extraction from ear-worn sensor data for gait analysis

    Gait analysis plays a significant role in assessing human walking patterns. It is generally used in sports science for understanding body mechanics, and it is also used to monitor gait abnormalities related to patients' neurological disorders. Traditional marker-based systems are well known for tracking gait parameters, but they require long setup times and are therefore very difficult to apply in everyday real-time monitoring. There is now ever-growing interest in developing portable devices and their supporting software with novel algorithms for gait pattern analysis. The aim of this research is to investigate the possibilities of novel gait pattern detection algorithms for accelerometer-based sensors. In particular, we have used the e-AR sensor, an ear-worn sensor that registers body motion via its embedded 3-D accelerometer. Gait data were semantically annotated using a pressure mat as well as real-time video recording. Important time stamps within a gait cycle, which are essential for extracting meaningful gait parameters, were identified. Furthermore, an advanced signal processing algorithm was applied to perform automatic feature extraction by signal decomposition and reconstruction. Analysis of real-world data has demonstrated the potential of an accelerometer-based sensor system and its ability to extract meaningful gait parameters.